The Minimum DAWG for All Suffixes of a String and Its Applications
Authors
Abstract
For a string w over an alphabet Σ, we consider a composite data structure called the all-suffixes directed acyclic word graph (ASDAWG). ASDAWG(w) has |w| + 1 initial nodes, and the dag induced by all nodes reachable from the k-th initial node conforms with DAWG(w[k :]), where w[k :] denotes the k-th suffix of w. We prove that the size of the minimum ASDAWG(w), written MASDAWG(w), is Θ(|w|) for |Σ| = 1 and Θ(|w|^2) for |Σ| ≥ 2. Moreover, we introduce an on-line algorithm that directly constructs MASDAWG(w) for a given w and whose running time is linear in the size of MASDAWG(w). We also present three application problems for which ASDAWGs are useful: beginning-sensitive pattern matching, region-sensitive pattern matching, and VLDC-pattern matching.
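The size bounds in the abstract can be explored on small inputs with a brute-force count. The sketch below is not the paper's on-line algorithm. It assumes that two nodes of a minimum ASDAWG coincide exactly when they accept the same set of continuations, which for a substring x read from the k-th initial node is determined by the set of end positions of the occurrences of x that start at or after position k. The function name `masdawg_node_count` is hypothetical, and only nodes are counted, not edges.

```python
def masdawg_node_count(w):
    """Brute-force node count of a minimum all-suffixes DAWG (illustrative).

    Assumption: nodes are merged exactly when they have the same right
    language, i.e. the same set of end positions among occurrences that
    start at or after the initial node's index k.
    """
    n = len(w)
    classes = set()
    for k in range(n + 1):
        # Map each substring of w[k:] (including the empty one) to the
        # end positions of its occurrences starting at index >= k.
        endpos = {}
        for i in range(k, n + 1):
            for j in range(i, n + 1):
                endpos.setdefault(w[i:j], set()).add(j)
        for ends in endpos.values():
            classes.add(frozenset(ends))
    return len(classes)


if __name__ == "__main__":
    # Unary strings grow linearly; binary strings may grow faster.
    for text in ("aaaaaa", "ababab", "aabbab"):
        print(text, masdawg_node_count(text))
```

For a unary string of length n this count is n + 1, which matches the linear bound stated for |Σ| = 1. The loop is roughly quartic in |w|, so it is only suitable for short strings.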
Similar resources
Space-Economical Construction of Index Structures for All Suffixes of a String
The minimum all-suffixes directed acyclic word graph (MASDAWG) of a string w has |w| + 1 initial nodes, where the dag induced by all reachable nodes from the k-th initial node conforms with the DAWG of the k-th suffix of w. A new space-economical algorithm for the construction of MASDAWG(w) is presented. The algorithm reads a given string w from right to left, and constructs MASDAWG(w) without ...
Computing DAWGs and Minimal Absent Words in Linear Time for Integer Alphabets
The directed acyclic word graph (DAWG) of a string y is the smallest (partial) DFA which recognizes all suffixes of y and has only O(n) nodes and edges. We present the first O(n)-time algorithm for computing the DAWG of a given string y of length n over an integer alphabet of polynomial size in n. We also show that a straightforward modification to our DAWG construction algorithm leads to the f...
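As a concrete picture of a DAWG as a partial DFA for suffixes, here is a minimal sketch of the classical on-line suffix-automaton construction. It is not the integer-alphabet algorithm announced in the abstract above: it stores transitions in Python dictionaries, and the names `State`, `build_dawg`, and `accepts` are illustrative only.

```python
class State:
    """One DAWG node: outgoing edges, suffix link, longest string length."""
    __slots__ = ("next", "link", "len")

    def __init__(self):
        self.next = {}
        self.link = -1
        self.len = 0


def build_dawg(s):
    """Build the DAWG of s on-line; return (states, accepting state ids)."""
    st = [State()]
    last = 0
    for ch in s:
        cur = len(st)
        st.append(State())
        st[cur].len = st[last].len + 1
        p = last
        # Add the new edge along the suffix-link chain where it is missing.
        while p != -1 and ch not in st[p].next:
            st[p].next[ch] = cur
            p = st[p].link
        if p == -1:
            st[cur].link = 0
        else:
            q = st[p].next[ch]
            if st[p].len + 1 == st[q].len:
                st[cur].link = q
            else:
                # Split q: the clone keeps only the shorter strings.
                clone = len(st)
                c = State()
                c.len = st[p].len + 1
                c.next = dict(st[q].next)
                c.link = st[q].link
                st.append(c)
                while p != -1 and st[p].next.get(ch) == q:
                    st[p].next[ch] = clone
                    p = st[p].link
                st[q].link = clone
                st[cur].link = clone
        last = cur
    # States on the suffix-link chain from the last state accept suffixes.
    accepting = set()
    p = last
    while p != -1:
        accepting.add(p)
        p = st[p].link
    return st, accepting


def accepts(st, accepting, x):
    """Return True if x is a suffix of the string the DAWG was built from."""
    v = 0
    for ch in x:
        v = st[v].next.get(ch)
        if v is None:
            return False
    return v in accepting
```

Calling `build_dawg("abcbc")` and then `accepts(...)` on each suffix of the string, including the empty one, should return True, while a substring that is not a suffix, such as "ab", is rejected.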
Suffix Tree
SYNONYMS: Compact suffix trie
DEFINITION: The suffix tree S(y) of a non-empty string y of length n is a compact trie representing all the suffixes of the string. The suffix tree of y is defined by the following properties:
• All branches of S(y) are labeled by all suffixes of y.
• Edges of S(y) are labeled by strings.
• Internal nodes of S(y) have at least two children.
• Edges outgoing an intern...
Position heaps: A simple and dynamic text indexing data structure
We address the problem of finding the locations of all instances of a string P in a text T, where preprocessing of T is allowed in order to facilitate the queries. Previous data structures for this problem include the suffix tree, the suffix array, and the compact DAWG. We modify a data structure called a sequence tree, which was proposed by Coffman and Eve for hashing [1], and adapt it to the...
Contracted Suffix Trees: A Simple and Dynamic Text Indexing Data Structure
We address the problem of finding the locations of all instances of a string P in a text T, where preprocessing of T is allowed to facilitate the queries. Previous data structures for this problem include the suffix tree, the suffix array, and the compact DAWG. We modify a data structure called a sequence tree, which was proposed by Coffman and Eve for hashing, and adapt it to the new problem. We can then p...
Publication date: 2002